Method and system for network processor scheduling based on service levels

ABSTRACT

A system and method of moving information units from an output flow control toward a data transmission network in a prioritized sequence which accommodates several different levels of service. The present invention includes a method and system for scheduling the egress of processed information units (or frames) from a network processing unit according to service based on a weighted fair queue, where position in the queue is adjusted after each service based on a weight factor and the length of the frame. A method and system of interaction between different calendar types is used to provide minimum bandwidth, best effort bandwidth, weighted fair queuing service, best effort peak bandwidth, and maximum burst size specifications. The present invention permits different combinations of service that can be used to create different QoS specifications. The “base” services which are offered to a customer in the example described in this patent application are minimum bandwidth, best effort, peak and maximum burst size (or MBS), which may be combined as desired. For example, a user could specify minimum bandwidth plus best effort additional bandwidth and the system would provide this capability by putting the flow queue in both the NLS and WFQ calendars. The system includes tests when a flow queue is in multiple calendars to determine when it must come out.

CROSS REFERENCE TO RELATED PATENTS

[0001] This patent relates to and claims the benefit of Provisional Patent Application Ser. No. 60/196,831 filed Apr. 13, 2000.

[0002] The present invention is related to the following documents, all of which are assigned to the assignee of the present invention and which are specifically incorporated herein by reference:

[0003] Patent application Ser. No. 09/384,691, filed Aug. 27, 1999 by Brian Bass et al., entitled “Network Processor Processing Complex and Methods”, sometimes referred to herein as the Network Processing Unit Patent or NPU Patent.

[0004] U.S. Pat. No. 5,724,348 entitled “Efficient Hardware/Software Interface for a Data Switch” issued Mar. 3, 1998, which patent is sometimes referred to herein as the Interface Patent.

[0005] Patent application Ser. No. 09/330,968 filed Jun. 11, 1999 and entitled “High Speed Parallel/Serial Link for Data Communications”, sometimes referred to as the Link Patent.

[0006] Various patents and applications assigned to IBM for its multiprotocol switching services, sometimes referred to as “MSS”, some of which include Cedric Alexander as an inventor, and are sometimes referred to as the MSS Patents.

[0007] Patent application Ser. No. 09/548,907 (Docket RAL9-00-0010) filed Apr. 13, 2000 by Brian M. Bass et al. and entitled “Method and System for Network Processor Scheduler”. This patent is sometimes referred to herein as the Scheduler Structure Patent.

[0008] Patent application Ser. No. 09/548,910 (Docket RAL9-00-0014) filed Apr. 13, 2000 by Brian M. Bass et al. and entitled “Method and System for Network Processor Scheduling Outputs Based on Multiple Calendars”. This patent is sometimes referred to herein as the Calendar Scheduling Patent.

[0009] Patent application Ser. No. 09/548,911 (Docket RAL9-00-0015) filed Apr. 13, 2000 by Brian M. Bass et al. and entitled “Method and System for Network Processor Scheduling Based on Calculation”. This patent is sometimes referred to herein as the Calculation Patent.

[0010] Patent application Ser. No. 09/548,912 (Docket RAL9-00-0017) filed Apr. 13, 2000 by Brian M. Bass et al. and entitled “Method and System for Network Processor Scheduling Outputs Using Queuing”. This patent is sometimes referred to herein as the Queuing Patent.

[0011] Patent application Ser. No. 09/548,913 (Docket RAL9-00-0018) filed Apr. 13, 2000 by Brian M. Bass et al. and entitled “Method and System for Network Processor Scheduling Outputs using Disconnect/Reconnect Flow Queues”. This patent is sometimes referred to herein as the Reconnection Patent.

[0012] Patent application Ser. No. 09/546,651 (Docket RAL9-00-0007) filed April, 2000 by Peter I. A. Barri et al. and entitled “Method and System for Minimizing Congestion in a Network”. This patent is sometimes referred to herein as the Flow Control Patent.

[0013] Patent application Ser. No. 09/547,280 (Docket RAL9-00-0004) filed Apr. 11, 2000 by M. Heddes et al. and entitled “Unified Method and System for Scheduling and Discarding Packets in Computer Networks”. This patent is sometimes referred to herein as the Packet Discard Patent.

BACKGROUND OF THE INVENTION

[0014] 1. Field of the Invention

[0015] The present invention relates to communication network apparatus such as is used to link together information handling systems or computers of various types and capabilities and to components and methods for data processing in such an apparatus. The present invention includes an improved system and method for scheduling the distribution of information units from a flow control system coupled to a plurality of network processing units toward a data transmission network through a PMM and MAC. More particularly, the present invention involves scheduling using a plurality of algorithms to handle a plurality of users who are processing variable size information packets or frames, providing an order to the frames being provided from the flow control system (which may be of the type described in the referenced Flow Control Patent) toward the data transmission network. The present invention includes a system for establishing and enforcing different types of service levels for the flows of different users.

[0016] 2. Background Art

[0017] The description of the present invention which follows is based on a presupposition that the reader has a basic knowledge of network data communications and the routers and switches which are useful in such network communications. In particular, this description presupposes familiarity with the International Standards Organization (“ISO”) model of network architecture which divides network operation into layers. A typical architecture based on the ISO model extends from a Layer 1 (which is sometimes referred to as “L1”) being the physical pathway or media through which signals are passed upward through Layers 2 (or “L2”), 3 (or “L3”), and so forth to Layer 7 which is the layer of application programming resident in a computer system linked to the network. Throughout this document, references to such layers as L1, L2, L3 are intended to refer to the corresponding layer of the network architecture. The present description also is based on a fundamental understanding of bit strings used in network communication known as packets and frames.

[0018] Bandwidth considerations (or the amount of data which a system can handle in a unit of time) are becoming important in today's view of network operations. Traffic over networks is increasing, both in sheer volume and in the diversity of the traffic. At one time, some networks were used primarily for a certain type of communications traffic, such as voice on a telephone network and digital data over a data transmission network. Of course, in addition to the voice signals, a telephone network would also carry a limited amount of “data” (such as the calling number and the called number, for routing and billing purposes), but the primary use for some networks had, at one point in time, been substantially homogenous packets.

[0019] A substantial increase in traffic has occurred as a result of the increasing popularity of the Internet (a public network of loosely linked computers sometimes referred to as the worldwide web or “www.”) and internal analogs of it (sometimes referred to as intranets) found in private data transmission networks. The Internet and intranets involve transmission of large amounts of information between remote locations to satisfy an ever-growing need for remote access to information and emerging applications. The Internet has opened up to a large number of users in geographically dispersed areas an exploding amount of remote information and enabled a variety of new applications, such as e-commerce, which has resulted in a greatly-increased load on networks. Other applications, such as e-mail, file transfer and database access further add load to networks, some of which are already under strain due to high levels of network traffic.

[0020] Voice and data traffic are also converging onto networks at the present time. Data is currently transmitted over the Internet (through the Internet Protocol or IP) at no charge, and voice traffic typically follows the path of lowest cost. Technologies such as voice over IP (VOIP) and voice over asynchronous transfer mode or ATM (VoATM) or voice over frame relay (VoFR) are cost-effective alternatives for transmission of voice traffic in today's environment. As these services migrate, the industry will be addressing issues such as the changing cost structure and concerns over the trade off between cost of service and quality of service in the transmission of information between processors.

[0021] Aspects of quality of service include the capacity or bandwidth (how much information can be accommodated in a period of time), the response time (how long does it take to process a frame) and how flexible is the processing (does it respond to different protocols and frame configurations, such as different encapsulation or frame header methods). Those using a resource will consider the quality of service as well as the cost of service, with the tradeoffs depending on the situation presented.

[0022] Some prior art systems handle outgoing information units from a processing system in a variety of ways. One suggestion is to use a round robin scheduler which provides fairness amongst a set of queues. Another employs several different levels of priorities and a queue for each. Such a system has an absolute priority, where the highest priority work is processed first and the lowest priority work may never be processed.

[0023] Still another method of scheduling outputs involves a plurality of prioritized lists of work to be processed.

[0024] It is also known to use a hierarchical packet scheduling system. There are even systems which use several different scheduling methods in determining the order in which information units are to be sent toward a data transmission network, using a combination of different scheduling techniques.

[0025] Other systems have used a weighted priority technique implemented in the form of a round robin, which serves all queues, with some queues served more frequently than other queues, based on an algorithm which defines the level of service. Even such a weighted priority system would provide service to a user who continually exceeds the service levels assigned to it, continuing to serve, albeit less often, even as it exceeds the assigned service level and making it difficult for the system to enforce a level of service policy.

[0026] Considering the size of a transmission packet or frame in determining which customers to serve adds a measure of fairness to a service system, in that a user who is processing large frames takes up more of the system capacity and therefore should receive service less often than a user with small frames. Some of the prior art systems consider the size of the transmission packet or frame in allocating resources, while others do not. Some communication systems use a uniform, fixed-size packet, making consideration of packet size unnecessary, but others do not consider the size of the packet in allocating resources.

[0027] Other prior art systems are directed to handling information units which are of a common size as in the so-called Asynchronous Transfer Mode (or ATM) system, so that size of the information unit is not considered in determining the priority of the current or a future information unit. An ATM system with a weight-driven scheduler is one of the solutions which is known in the prior art to schedule outputs from an ATM system.

[0028] In any such system which involves weighting and queueing, it is desirable to allow for different types of service, for example, minimum bandwidth, best effort bandwidth, weighted fair queuing service, best effort peak bandwidth, and maximum burst size. While each of these types of service level is well known and accommodated in the prior art, it is a challenge to allow for the use of any or all of them in the same system. It is also desirable to implement the weighted fair queuing using a system which considers the size of the transmission packet in determining the priority to be assigned to the packet in the queue.

[0029] Thus, the prior art systems for handling data packets for transmission to a network have undesirable disadvantages and limitations which have an effect on the perceived fairness of the system.

SUMMARY OF THE INVENTION

[0030] The present invention overcomes the disadvantages and limitations of the prior art systems by providing a simple, yet effective, way of handling information units or frames coming out of a processing system and directing frames to output ports for dispatch to a data transmission network while providing a variety of different types of service levels in the same system.

[0031] The present invention allows a single processing system to accommodate users which have service level agreements which include characteristics such as minimum bandwidth, best effort bandwidth, weighted fair queuing service, best effort peak bandwidth, and maximum burst size specifications, and any combinations of these characteristics in the same agreement.

[0032] The present invention has the advantage that it allows the efficient use of resources and requires a minimum overhead to accommodate the various types of service levels. The present system establishes the types of service level agreement characteristics (also referred to as QoS) and provides the mechanism for enforcing them through manipulation of flow queues within a combination of time based calendars and weighted fair queuing calendars. The present invention also uses a technique for enforcing a level of service characteristic (for example, a minimum bandwidth) by determining the earliest time for the next service as a result of the current service, then testing the next request for service to determine whether it is after the allowable time for the next service based on the bandwidth established for the user.
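The enforcement test described in the preceding paragraph might be sketched as follows. This is a minimal illustration only: the function names, the use of seconds and bits per second, and the rule that the spacing between services equals frame length divided by the established rate are assumptions made for the example and are not taken from this description.

```python
def next_service_time(current_time_s, frame_bytes, min_rate_bps):
    """Earliest allowable time of the next service (assumed rule).

    Assumption: serving a frame of frame_bytes at an established rate of
    min_rate_bps uses up frame_bytes * 8 / min_rate_bps seconds of the
    user's bandwidth allocation.
    """
    return current_time_s + (frame_bytes * 8) / min_rate_bps


def request_is_within_allocation(request_time_s, earliest_next_s):
    """True when the next request arrives at or after the allowable time."""
    return request_time_s >= earliest_next_s


# A 1250-byte frame served at t = 0 on a 1 Mbit/s allocation.
earliest = next_service_time(0.0, 1250, 1_000_000)
early_request_ok = request_is_within_allocation(0.005, earliest)
late_request_ok = request_is_within_allocation(0.020, earliest)
```

Under these assumptions, a request that arrives before the computed time is one that exceeds the established bandwidth, while a request at or after that time is within it.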

[0033] The present invention also allows for the use of any unused bandwidth by others through the use of a weighted fair queuing system which allows for the individual users to compete on a weighted fair basis for bandwidth which is unused at any given time. That is, even if bandwidth has been established for a user (for example, a user with a minimum bandwidth), when that bandwidth is not being used for that user, it may be used by others.

[0034] The system and method of the present invention allows for the fair use of the unused bandwidth by considering the size of the packet when determining the service order. That is, a user who sends a large packet is serviced later in the queue for unused bandwidth than a user who sends a small packet.
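The effect of packet size on service order can be illustrated with a short sketch. The function name and the specific rule used here (new position equals current position plus frame length multiplied by a weight factor) are hypothetical; this description states only that the position in the queue depends on a weight factor and the length of the frame, not the exact calculation.

```python
def wfq_next_position(current_position, frame_bytes, queue_weight):
    """Assumed rule: the distance moved grows with frame length and weight."""
    return current_position + frame_bytes * queue_weight


# Two users with the same weight factor, served at the same calendar position.
large_sender = wfq_next_position(10, 1500, 2)  # sent a 1500-byte frame
small_sender = wfq_next_position(10, 64, 2)    # sent a 64-byte frame
```

With this assumed rule, the sender of the large frame lands farther from the current position and is therefore serviced later than the sender of the small frame.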

[0035] Different levels of service are accomplished by establishing different calendars, both time based and weighted fair queuing, and assigning flow queues to locations in one or more calendars. The calendars selected are determined based on the service level agreement which has been requested and paid for. Then, a user who has paid for a minimum bandwidth receives priority over others while that user is operating within and does not exceed that bandwidth. To the extent that the user with a minimum bandwidth exceeds that bandwidth, then the user may compete with other users for weighted fair use bandwidth allocation according to his service level agreement using a method which considers the length of the transmission packet. Similarly, a user who has arranged for a best effort bandwidth is provided with that bandwidth to the extent that it is available and a user who has arranged for best effort peak bandwidth or maximum burst size service is accorded the services in the system for allocating bandwidth of the present invention.

[0036] Other objects and advantages of the present invention will be apparent to those skilled in the relevant art in view of the following description of the preferred embodiment, taken together with the accompanying drawings and the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0037] Having thus set forth some of the limitations and disadvantages of the prior art and some objects and advantages of the present invention, other objects and advantages will be apparent to those skilled in the relevant art in view of the following description of the drawings illustrating the present invention of an improved routing system and method in which:

[0038] FIG. 1 is a block diagram for an interface device including an embedded processor complex which is described in the NPU Patent, showing a DN Enqueue system and scheduler useful in practicing the present invention;

[0039] FIG. 2 is a block diagram of an embedded processor complex of the type shown in FIG. 1, with the DN Enqueue (and its included scheduler) useful in understanding the present invention;

[0040] FIG. 3 illustrates the scheduler of FIGS. 1-2, illustrating a system for scheduling egress of variable length packets according to the preferred embodiment of the present invention, in an “egress scheduler”;

[0041] FIG. 4 illustrates a time-based calendar according to the preferred embodiment;

[0042] FIGS. 5-8 illustrate the method and system for enqueuing packets into the scheduler system; and

[0043] FIGS. 9-13 are logic flow charts of the calculations performed in the egress scheduler of the present invention, illustrating the servicing of a selected flow queue and calendar using the system of the present invention to provide minimum bandwidth, best effort bandwidth, weighted fair queuing service, best effort peak bandwidth and maximum burst size specifications, as different methods of sharing bandwidth among users.

[0044] DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0045] In the following description of the preferred embodiment, the best implementations of practicing the invention presently known to the inventors will be described with some particularity. However, this description is intended as a broad, general teaching of the concepts of the present invention in a specific embodiment but is not intended to be limiting the present invention to that as shown in this embodiment, especially since those skilled in the relevant art will recognize many variations and changes to the specific structure and operation shown and described with respect to these figures.

[0046] FIG. 1 shows a block diagram of the interface device chip that includes the substrate 10 and a plurality of subassemblies integrated on the substrate. The subassemblies are arranged into an upside configuration and a downside configuration, with the “upside” configuration (sometimes also referred to as an “ingress”) referring to those components relating to data inbound to the chip from a data transmission network (up to or into the chip) and “downside” (sometimes referred to as an “egress”) referring to those components whose function is to transmit data from the chip toward the data transmission network in an outbound fashion (away from the chip or down and into the network). Data flows follow the respective arrangements of the upside and downside configurations; thus, there is an upside data flow and a downside data flow in the system of FIG. 1. The upside or ingress configuration elements include an Enqueue-Dequeue-Scheduling UP (EDS-UP) logic 16, multiple multiplexed MAC's-UP (PMM-UP) 14, Switch Data Mover-UP (SDM-UP) 18, System Interface (SIF) 20, Data Align Serial Link A (DASL-A) 22 and Data Align Serial Link B (DASL-B) 24. Data links are more fully described in the Link Patent referenced above, and reference should be made to that document for a greater understanding of this portion of the system. It should be understood that, although the preferred embodiment of the present invention uses the data links as more fully described in that patent, other systems can be used to advantage with the present invention, particularly those which support relatively high data flows and system requirements, since the present invention is not limited to those specific auxiliary devices such as the data links which are employed in the preferred embodiment.

[0047] The components depicted on the downside (or egress) of the system include data links DASL-A 26 and DASL-B 28, system interface SIF 30, switch data mover SDM-DN 32, enqueue-dequeue-scheduler EDS-DN 34 and multiple multiplexed MAC's for the egress PMM-DN 36. The substrate 10 also includes a plurality of internal static random access memory components (S-RAM's), a traffic management scheduler (TRAFFIC MGT SCHEDULER, also known as the Egress Scheduler) 40 and an embedded processor complex 12 described in greater depth in the NPU Patent referenced above. An interface device 38 is coupled by the respective DMU busses to PMM 14, 36. The interface device 38 could be any suitable hardware apparatus for connecting to the L1 circuitry, such as Ethernet physical (ENET PHY) devices or asynchronous transfer mode framing equipment (ATM FRAMER), both of which are examples of devices which are well known and generally available for this purpose in the trade. The type and size of the interface device are determined, at least in part, by the network media to which the present chip and its system are attached. A plurality of external dynamic random access memory devices (D-RAMS) and a S-RAM are available for use by the chip.

[0048] While here particularly disclosed for networks in which the general data flow outside the relevant switching and routing devices is passed through electric conductors such as wires and cables installed in buildings, the present invention contemplates that the network switches and components thereof could be used in a wireless environment as well. For example, the media access control (MAC) elements herein disclosed may be replaced with suitable radio frequency devices, such as those made from silicon germanium technology, which would result in the connection of the device disclosed directly to a wireless network. Where such technology is appropriately employed, the radio frequency elements can be integrated into the VLSI structures disclosed herein by a person of skill in the appropriate arts. Alternatively, radio frequency or other wireless response devices such as infrared (IR) response devices can be mounted on a blade with the other elements herein disclosed to achieve a switch apparatus which is useful with wireless network apparatus.

[0049] The arrows show the general flow of data within the interface system shown in FIG. 1. Frames of data or messages (also sometimes referred to as packets or information units) received from an Ethernet MAC 14 off the ENET PHY block 38 via the DMU bus are placed in internal data store buffers 16 a by the EDS-UP device 16. The frames may be identified as either normal frames or guided frames, which then relates to method and location of the subsequent processing in the plurality of processors. After the input units or frames are processed by one of the plurality of processors in the embedded processor complex, the completed information units are scheduled through the scheduler 40 out of the processing unit 10 and onto the data transmission network through the PMM-DN multiplexed MAC's 36 and the physical layer 38.

[0050] FIG. 2 is a block diagram of a processing system 100 which can employ the present invention to advantage. In this FIG. 2, a plurality of processing units 110 are located between a dispatcher unit 112 and a completion unit 120. Each incoming frame F (from a switch, not shown, attached to the present data processing system) is received and stored into a DOWN data store (or DN DS) 116, then sequentially removed by the dispatcher 112 and assigned to one of the plurality of processing units 110, based on a determination by the dispatcher 112 that the processing unit is available to process the frame. Greater detail on the structure and function of the processing units 110 in particular, and the processing system in general, can be found in the NPU Patent referenced above and patent applications and descriptions of the individual components such as a flow control device detailed in the Flow Control Patent. Interposed between the dispatcher 112 and the plurality of processing units 110 is a hardware classifier assist 118 which is described in more detail in a pending patent application Ser. No. 09/479,027 filed Jan. 7, 2000 by J. L. Calvignac et al. and assigned to the assignee of the present invention, an application which is incorporated herein by reference. The frames which are processed by the plurality of network processors 110 go into a completion unit 120 which is coupled to the DN Enqueue 34 through a flow control system as described in the Flow Control Patent and the Packet Discard Patent. The DN Enqueue 34 is coupled to the DN Scheduler which is coupled through the PMM DN MAC's 36, then by the DMU data bus to the physical layer 38 (the data transmission network itself).

[0051] The egress scheduler 40 of FIG. 3 provides a structure and method of operation which permits the functions of scheduling frame transmission from a network processing unit to a data transmission network in accordance with a minimum bandwidth algorithm, peak bandwidth algorithm, weighted fair queueing techniques and maximum burst size scheduling in a single unified scheduler system. It is described more fully in the Scheduler Structure Patent referenced above.

[0052] The scheduler system illustrated in FIG. 3 is comprised of a plurality of flows 210, time-based calendars 220, 230, 250, weighted fair queuing (WFQ) calendars 240 and target port queues 260.

[0053] The flows 210 and their operation are described in more detail in the referenced Flow Control Patent and the referenced Packet Discard Patent. The flows 210 are control structures that are used to maintain ordered lists of frames which share common system characteristics based on assignment, that is, the level of service that the associated user has selected and paid for. These characteristics include minimum bandwidth, peak bandwidth, best effort bandwidth and maximum burst size quality of service (QoS) requirements. In addition to flow queues set up for the purpose of supporting QoS for communication systems, the preferred embodiment requires flow queues defined for the purpose of discarding frames (i.e. filtered traffic), and the wrapping of frame data from the egress to the ingress of the network processor system.

[0054] Time-based calendars 220, 230, 250 are used for scheduling packets with minimum bandwidth and best effort peak rate requirements. As shown in FIG. 3, three time based calendars are used for this purpose: two calendars 220, 230 for minimum bandwidth and a third calendar 250 used to limit flow queues to a maximum best effort peak rate (peak bandwidth shaping). Two time-based calendars 220, 230 (one calendar 220 identified as low latency service or LLS and the other calendar 230 identified as normal latency service or NLS) provide for minimum bandwidth and allow support of different classes of service within a minimum bandwidth QoS class (i.e., low latency and normal latency).

[0055] Weighted fair queuing (WFQ) calendars 240 are used for best effort service, and best effort peak service (when used in combination with the time-based calendar 250). Further, the WFQ calendars 240 support a queue weight that allows support of different classes of service within a best effort service QoS class. In the preferred embodiment there are 40 such WFQ calendars, corresponding to the number of supported media ports (output ports). The selection of 40 such ports is a trade off between hardware cost and design complexity and is not intended to limit the scope of the invention.

[0056] In each of the above mentioned calendars, a pointer (a Flow ID) is used to represent a flow queue's location within the calendar. Thus, flow 0 has its Flow ID at 221 in calendar 220, flow 1 has its FlowID at 232 in calendar 230 and at 241 in the WFQ calendar 240 and flow 2047 has its FlowID at 231 in calendar 230 and at 251 in calendar 250, all as indicated by the arrows in FIG. 3. Further there may be none, one, or two such pointers to a single flow queue present in the plurality of calendars in the system. Typically, pointers in a calendar do not represent un-initialized or empty flow queues. When a pointer to a flow queue (or a FlowID) is present in a particular calendar in the system, the flow queue may be referred to as being “in” that particular calendar.
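The calendar membership described in the preceding paragraph can be pictured with a small sketch. The dictionary layout, the key names, and the helper function are illustrative only and are not part of the preferred embodiment; the flow numbers and calendar reference numerals follow the FIG. 3 example given in the text.

```python
# Calendar name -> set of Flow IDs currently "in" that calendar.
# Keys combine the calendar's role with its FIG. 3 reference numeral.
calendars = {
    "LLS_220": {0},
    "NLS_230": {1, 2047},
    "WFQ_240": {1},
    "PEAK_250": {2047},
}


def calendars_containing(flow_id):
    """Return the sorted names of the calendars a flow queue is in."""
    return sorted(name for name, flows in calendars.items() if flow_id in flows)


flow_1_calendars = calendars_containing(1)
flow_5_calendars = calendars_containing(5)  # a flow queue in no calendar
```

In this example every flow queue has none, one, or two pointers across the calendars, matching the limit stated in the text.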

[0057] Target port queues are control structures used to maintain ordered lists of frames that have common port destination and priorities. In the preferred embodiment, 2 priorities per media port (or output port) are provided to allow support of different classes of service, a so-called high priority target port queue and a so-called low priority target port queue. The selection of 2 priorities is a trade off between hardware cost and design complexity and is not intended to limit the scope of the invention. Further, the preferred embodiment includes a separate wrap queue 270 and a discard port queue 272.

[0058] Each of the time-based calendars 220, 230 and 250 consists of a plurality of epochs, with four shown for each in FIG. 3 as represented by the overlapping rectangles. FIG. 4 shows the four epochs 302, 304, 306 and 308 along with a typical timing arrangement for the epochs where the first epoch 302 (labeled epoch 0) has a step of the scheduler tick (150 nsec in this case divided by 512 Bytes), the second epoch 304 has a step of 16 times that of the first epoch 302, with the third epoch 306 having the same ratio to the second epoch 304 and the fourth epoch 308 having the same ratio to the third epoch 306. In this way, the first epoch 302 has a high priority (it is scheduled for service sixteen times as often as the second epoch 304), creating a hierarchy of service priorities which will have associated increases in cost. A current pointer (e.g., 312 for epoch 302) is associated with each epoch to provide a pointer as to where in the queue the processing is currently located. Since the arbitrary system of progressing through the epochs is to increment the current pointer, the direction of processing is from lower to higher in the epoch. Also shown in this FIG. 4 is the current time 320 and a scheduler tick 330 which drives the clock 320 as well as the priority selection.
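The ratio between epoch steps can be worked through in a few lines. This sketch assumes only what the paragraph states: a scheduler tick of 150 nsec for the first epoch and a factor of 16 between successive epochs. The function and constant names are illustrative.

```python
SCHEDULER_TICK_NS = 150  # step of the first epoch in this example
EPOCH_RATIO = 16         # each epoch steps 16 times slower than the previous


def epoch_step_ns(epoch_index):
    """Step of the given epoch (0 through 3) in nanoseconds."""
    return SCHEDULER_TICK_NS * EPOCH_RATIO ** epoch_index


steps = [epoch_step_ns(n) for n in range(4)]
```

Under these figures the four steps are 150, 2400, 38400 and 614400 nsec, so the fourth epoch advances 4096 times more slowly than the first.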

[0059] The priority selection is an absolute priority selection, which means that since only one can be serviced during any interval, the one with the highest priority is serviced. If the current pointer in each of the epochs points to a calendar entry with a Flow ID, the lowest one (epoch 0) will be serviced. If epoch 0 requires no service (no Flow ID is present at that location), then epoch 1 is serviced, then epoch 2, etc.
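The absolute priority selection among epochs amounts to picking the lowest-numbered epoch whose current pointer holds a Flow ID. A minimal sketch, in which the function name and the use of `None` for an empty calendar entry are illustrative choices:

```python
from typing import Optional, Sequence


def select_epoch(current_entries: Sequence[Optional[int]]) -> Optional[int]:
    """Return the index of the epoch to service, or None if all are empty.

    current_entries[i] is the Flow ID found at the current pointer of
    epoch i, or None when no Flow ID is present at that location.
    """
    for epoch_index, flow_id in enumerate(current_entries):
        if flow_id is not None:
            return epoch_index  # lowest-numbered epoch wins outright
    return None


chosen = select_epoch([None, 7, 3, None])  # epoch 0 empty, so epoch 1 is served
```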

[0060] FIGS. 5-8 illustrate the process and system of enqueuing packets into the scheduler system according to the preferred embodiment of the present invention. This method and system are described in some detail in the incorporated documents referenced above, particularly the Scheduler Structure Patent, and the reader is referred to that patent for a fuller description of the process of enqueuing packets.

[0061] FIGS. 9-13 illustrate the logic and calculations performed by the egress scheduler to provide an interaction between different calendars so as to provide minimum bandwidth, best effort bandwidth, weighted fair queuing service for shared use of the bandwidth left over from serving customers with minimum bandwidth, peak bandwidth and maximum burst size bandwidth.

[0062] The present invention uses a plurality of different queues for receiving frames and scheduling them for output based on the type of service which has been associated with a given flow of processed frames. That is, for each flow or origination of frames, a level of service type (service level agreement or SLA) has been established, often as a result of a payment for a given type and level of service. Some users wish to have an assigned minimum bandwidth while others are happy with a best effort bandwidth. Because of differences in desirability of different types of service (and different levels of service within a type), different costs are associated with the service and some users are willing to pay a premium for a minimum bandwidth while other users are seeking a more economical service and are willing to accept a lesser service such as best effort bandwidth or a weighted fair queuing system. The present invention associates with each SLA and user a queue (flow queue as discussed above) with the necessary characteristics that define the SLA and the queue's interaction with the various calendars (e.g., which calendars are used for the service, the number of calendars and the scheduling within each calendar). The basic characteristics (minimum bandwidth, best effort WFQ, best effort peak bandwidth, and maximum burst size) require a method that allows the time based calendars and the WFQ calendars to interact with a single flow queue to provide the desired SLA characteristics (e.g., minimum bandwidth with best effort peak). Other combinations are discussed in the Scheduler Structure Patent. These interactions are described below.

[0063] FIGS. 5-8 illustrate a packet enqueue, that is, the processing when a packet is enqueued to a flow queue. FIG. 5 illustrates service for a flow queue not in any calendar. FIG. 6 illustrates the processing when timestamps are valid but the flow queue is not in all of the calendars it needs to be in. FIG. 7 is for a flow queue where there is no minimum bandwidth component and the peak component is not in a calendar. FIG. 8 illustrates a flow queue where the peak component is not in a calendar but a minimum bandwidth component is in a calendar.

[0064] FIG. 5 begins with a packet enqueue to a flow queue, and at block 1109 a list management action occurs to enqueue the frame. At block 1100, whether Q is in use (QinUse=1) is tested to indicate whether the flow is in any calendar. If so, then control passes to location 0102A (see FIG. 6); if not, control passes to block 1110, where SSD.V is tested as equal to 0. If so, then QD is tested for 0 at block 1122 and, if it is, a configuration error is identified; otherwise a pointer to the WFQ queue is added at a location specified by a WFQ distance calculation at block 1115 for flow startup and QinBlue is set to 1 at block 1116, indicating that this is part of a blue or WFQ calendar. If SSD.V was not zero at block 1110, then at block 1111 the value P is used to determine the calendar in which to include the flow: if P=1, then at block 1112 a pointer is added to the NLS calendar; if not, a pointer is added to the LLS calendar at block 1113. In either case, the calendar uses the current time and at block 1114 the flag QinRed is set to indicate that this is in the NLS or LLS calendar. Block 1117 follows from either block 1114 or 1116 and tests whether MBS.V (for maximum burst size service) is equal to zero. If this is zero (for no maximum burst size service), then the processing for this flow queue is complete at block 1121 with RR.V set to 0 and QinUse set to 1. If MBS.V was not zero at block 1117, then the credit is updated at block 1118 before passing to block 1121. From block 1116, if PSD.V is not zero (tested at block 1119), then the NextGreenTime is set to the current time at block 1120 before passing to block 1121.
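The enqueue decisions of FIG. 5 may be easier to follow as a short sketch. The following Python is illustrative only and is not taken from the figures: the class `FlowQueue`, the function `enqueue_idle_flow`, the string calendar labels and the `credit_updates` counter are hypothetical names introduced here. The WFQ distance calculation and the credit update themselves are not reproduced, and the sketch assumes that the PSD.V test at block 1119 is made before the MBS.V test at block 1117.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FlowQueue:
    ssd_v: int = 0            # SSD.V: nonzero means a minimum bandwidth component
    psd_v: int = 0            # PSD.V: nonzero means a best effort peak component
    mbs_v: int = 0            # MBS.V: nonzero means a maximum burst size component
    qd: int = 0               # QD: nonzero means a best effort (WFQ) component
    p: int = 1                # P=1 selects the NLS calendar, otherwise LLS
    frames: List[str] = field(default_factory=list)
    q_in_use: int = 0
    q_in_red: int = 0
    q_in_blue: int = 0
    rr_v: int = 0
    next_green_time: Optional[int] = None
    credit_updates: int = 0   # hypothetical stand-in for the credit update


def enqueue_idle_flow(fq: FlowQueue, frame: str, now: int) -> str:
    """Enqueue a frame to a flow queue that is in no calendar (QinUse=0).

    Returns the label of the calendar the flow queue was attached to.
    """
    if fq.q_in_use:
        raise ValueError("flow queue already in a calendar; FIG. 6 applies")
    fq.frames.append(frame)                        # block 1109
    if fq.ssd_v == 0:                              # block 1110
        if fq.qd == 0:                             # block 1122
            raise ValueError("configuration error: no service component")
        calendar = "WFQ"                           # block 1115 (distance omitted)
        fq.q_in_blue = 1                           # block 1116
        if fq.psd_v != 0:                          # block 1119
            fq.next_green_time = now               # block 1120
    else:
        calendar = "NLS" if fq.p == 1 else "LLS"   # blocks 1111-1113
        fq.q_in_red = 1                            # block 1114
    if fq.mbs_v != 0:                              # block 1117
        fq.credit_updates += 1                     # block 1118 (details omitted)
    fq.rr_v = 0                                    # block 1121
    fq.q_in_use = 1
    return calendar
```

For example, a flow queue with only a minimum bandwidth component and P=1 would be attached to the NLS calendar with QinRed set, while a flow queue with only best effort and peak components would be attached to the WFQ calendar with its NextGreenTime set to the current time.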

[0065] If a flow queue was determined in block 1100 of FIG. 5 to be in a calendar, then at FIG. 6 it is determined whether the flow queue is in all of the calendars it needs to be in. If SSD.V is set to indicate that there is a minimum bandwidth component (No from block 1200) and QinRed is set to 1 (at block 1209), then control passes to FIG. 8 where the flow is enqueued to a calendar for best effort peak service. FIG. 7 is invoked if SSD.V is 0 at block 1200 and PSD.V is not zero at block 1200a and QinBlue and QinGrn are both not set (blocks 1201 and 1202). A pointer to the WFQ at a location calculated using the Calculation Patent is added if PSD.V is zero (at block 1200a) and QD is not zero, indicating a best effort, WFQ, component (at block 1203) and QinBlue is not set (at block 204).

[0066] If QinRed is not set (at block 1209), then the NextRedTime is compared with the current time at blocks 1214 and 1211 to determine if service is due given the level of minimum bandwidth service set for the flow. If the NextRedTime is not later than the current time, then a pointer at the current time is added to the appropriate calendar at block 1216 or block 1212, depending on the value of P tested at block 1210 (if P=1, then normal latency service (NLS) is provided; if P is not equal to 1, then low latency service (LLS) applies). Similarly, if the NextRedTime is later than the current time, at blocks 1215 and 1213 a pointer is added to the appropriate calendar based on the value of P.

[0067] From blocks 1212 and 1216, the RR.V value is set to zero at block 1217 and the MBSCredit value is updated at block 1222. Then at block 1218, QinRed is set to 1, and at block 1221 the NextGreenTime is set to the current time value if PSD.V is not zero and the current time is later than the NextGreenTime.

[0068] FIG. 7 illustrates the processing where the peak component is not in a calendar and there is no minimum bandwidth component. This occurs as a result of the processing in FIG. 6, the output of block 1202. The NextGreenTime value is compared with the current time at block 1302 and, if it is later, then at block 1303 a pointer is added to the PBS calendar at the NextGreenTime (when the flow is eligible for service) and the QinGrn flag is set. If the NextGreenTime value is not later than the current time at block 1302, then this flow is eligible for service at this time, and block 1305 adds a pointer to the WFQ at a location specified by the WFQ distance calculation as specified by the Calculation Patent and block 1306 sets the QinBlue flag.
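The comparison made in FIG. 7 can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `attach_peak_only_flow` and the returned tuple layout are hypothetical, times are modeled as plain integers with no wrap-around, and the WFQ location (which the text assigns to the WFQ distance calculation) is left unspecified as `None`.

```python
from typing import Optional, Tuple


def attach_peak_only_flow(next_green_time: int,
                          now: int) -> Tuple[str, str, Optional[int]]:
    """Choose the calendar for a flow with a peak component and no minimum
    bandwidth component, returning (calendar, flag to set, attach time)."""
    if next_green_time > now:                      # block 1302: not yet eligible
        # Block 1303: wait in the PBS calendar until NextGreenTime.
        return ("PBS", "QinGrn", next_green_time)
    # Blocks 1305 and 1306: eligible now, so attach to the WFQ calendar.
    # The location comes from the WFQ distance calculation (not modeled here).
    return ("WFQ", "QinBlue", None)
```

Under these assumptions, a flow whose NextGreenTime is 15 when the current time is 10 is parked in the PBS calendar at time 15, while a flow whose NextGreenTime equals the current time is attached to the WFQ calendar immediately.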

[0069] FIG. 8 illustrates the processing where the minimum bandwidth component is in a calendar but the peak component is not in a calendar, and involves examining the NextGreenTime stamp to determine whether to attach the flow to the WFQ or the PBS calendar. If MBS (Maximum Burst Size) is specified for this flow, then the MBS credit is examined and, if positive, this flow is added to either the WFQ or the PBS calendar. This results from the test of FIG. 6 at block 1209 and involves tests to determine that QD is not equal to 0 (at block 1402), that the QinGrn flag and the QinBlue flag have not been set (at blocks 1404 and 1405), that MBS.V is set, indicating that there is a maximum burst size component (at block 1413), and that the associated credit is positive (at block 1414). If so, then the flow is added either to the WFQ (at block 1411) or to the PBS (at block 1409) and the appropriate flag is set at block 1412 (blue for the WFQ) or at block 1410 (green for the PBS), depending on whether the NextGreenTime is later than the current time (tested at block 1408). If PSD.V is 0 at block 1406, then a pointer is added to the WFQ at a location specified by the Calculation Patent at block 1407.

[0070] FIGS. 9-11 illustrate the processing of a flow queue that has been selected. The selection process is described in the Scheduler Structure Patent. The result of this process is a flow queue that is to be serviced, the calendar in which the flow queue was found, and the location at which it was found. Information about the packet is obtained from the flow queue itself and other control structures (also described in the Scheduler Structure Patent). FIG. 9 illustrates that each calendar type is serviced (only one per scheduler tick), with flow queues reattached only if there are sufficient frames in the flow queue. Flow queues added to the weighted fair queuing (WFQ) calendar with peak bandwidth components get the NextGreenTime stamp updated. FIG. 10 illustrates the flow queue service when the queue was empty (QCnt=0) at the start of service. FIG. 11 illustrates the Low Latency (or Low Latency Sustainable) (LLS) or Normal Latency (or Normal Latency Sustainable) (NLS) service calendar. FIG. 12 illustrates the Weighted Fair Queuing (WFQ) service calendar. FIG. 13 illustrates the Service PBS (or Peak Burst Service) calendar.

[0071] Starting with FIG. 9, block 500, the flow queue is examined to determine if the queue contains any packets (that is, whether the Queue Count, or QCnt, is 0). If the queue is empty, the process continues at FIG. 10, block 541; otherwise the queue is not empty and the process continues with blocks 512, 505 and 521.

[0072] Block 505 tests to determine if the calendar in service is a timer based LLS or NLS calendar. If so, processing continues with block 1101 in FIG. 11 and is described below.

[0073] Block 521 tests to determine if the calendar in service is a timer based PBS calendar. If so, processing continues with block 1306 in FIG. 13 and is described below.

[0074] Block 512 tests to determine if the calendar in service is a WFQ calendar. If so, a packet is dequeued from the flow queue (list management action) and the process continues at block 503, which tests whether the SLA for this flow queue includes a maximum burst size component (MBS.V is not equal to 0); if so, then the MBSCredit field of the flow queue is updated to reflect the usage (as described in the Reconnection Patent). The process continues at block 513, where the flow queue's frame count is tested. When the frame count is one or less (the No branch from 513), additional processing is required to determine if the flow queue should be moved to another location in the calendar, or if it should be removed; this determination is illustrated in FIG. 12 and is discussed below.

[0075] When the frame count is 2 or greater (the Yes branch from 513), the flow queue is examined for a best effort peak bandwidth service component at block 514. If there is a best effort peak bandwidth service component, then processing continues at block 516 (the No branch), where a test determines whether the flow queue is being serviced too early and would be in violation of the peak bandwidth specification for this flow queue. If the flow is found to be in violation (the Yes branch out of 516), the flow queue is removed from the WFQ calendar (QinBlue=0 at 528) and the NextGreenTime field in the flow queue is updated for a flow in violation as described in the Reconnection Patent (block 518).

[0076] Returning to block 528, processing continues to determine if the flow queue's peak bandwidth component is restricted by use of a maximum burst size specification (at 523) and, if it is, whether there is any credit remaining to this flow queue (at 524). If there is no restriction, or if there is remaining MBSCredit, the flow queue is added to the PBS calendar at the time specified by the NextGreenTime calculation for a flow in violation as described in the Reconnection Patent (block 517). Otherwise, the flow queue is not reattached to a calendar.

[0077] Returning to block 514, if best effort peak bandwidth is not specified for this flow queue and if there is a restriction on use of best effort bandwidth by use of a maximum burst size specification (No branch at 525), and if there is any remaining credit for the flow queue (Yes branch at 526), then the flow queue is re-attached to the WFQ calendar at the location specified by the WFQ distance calculation as described in the Queuing Patent. Otherwise, the flow is removed from the WFQ calendar (at 529) and is not re-attached to any calendar.

[0078] Returning to block 516, if the flow is not in violation of its best effort peak bandwidth specification (No branch at 516), then the NextGreenTime field of the flow queue is updated using the calculation for a flow not in violation as described in the Reconnection Patent at 520. Processing at 521 and 522 determines, as described above, if the flow is restricted by maximum burst size specifications (521) and, if so, whether there is any remaining MBSCredit (522). If there is no remaining credit, then the flow is removed from the WFQ calendar (527) and is not re-attached. If there is no restriction or if there is credit, then the flow queue is added to the WFQ calendar at the location specified by the WFQ distance calculation as described in the Queuing Patent (block 519).
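For a flow queue with a best effort peak bandwidth component and two or more frames, the re-attachment outcome described for FIG. 9 reduces to a small decision table. The sketch below is illustrative only: the function name `wfq_reattach_with_peak` and its boolean parameters are hypothetical, and the NextGreenTime and WFQ distance calculations of the referenced patents are not modeled.

```python
from typing import Optional


def wfq_reattach_with_peak(in_violation: bool,
                           mbs_specified: bool,
                           credit_remaining: bool) -> Optional[str]:
    """Return the calendar a flow queue is re-attached to after WFQ service,
    or None when it is not re-attached to any calendar."""
    # A maximum burst size restriction with no credit left blocks
    # re-attachment on both branches (blocks 523/524 and 521/522).
    if mbs_specified and not credit_remaining:
        return None
    # Serviced too early: delay the flow in the PBS calendar (block 517).
    if in_violation:
        return "PBS"
    # Not in violation: stay in the WFQ calendar (block 519).
    return "WFQ"
```

Under these assumptions, a flow serviced too early with no burst restriction moves to the PBS calendar, a flow within its peak specification stays in the WFQ calendar, and a flow with a burst restriction and no remaining MBSCredit is dropped from the calendars in either case.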

[0079] FIG. 10 illustrates flow queue service when QCnt=0 at the start of service, that is, the handling of a flow queue that is found empty (QCnt=0 at block 500, from FIG. 9) when selected for service. Actions vary depending on the calendar; however, in all cases the flow queue is not re-attached to any calendar. Blocks 541, 543 and 548 determine the calendar type. Blocks 542, 544 and 549 remove the flow queue from the calendar and clear the InUse bits. The frame count is set to 0. Additional modifications are required if the flow queue was selected from a WFQ calendar and the flow queue specified a best effort peak bandwidth. If the flow queue is not being serviced too soon based on its peak bandwidth specification (No branch out of block 546), then the NextGreenTime is marked invalid at block 547. This gives the flow queue access to best effort service when it is subsequently serviced.

[0080] Several elements in FIG. 10 provide functions to support flow queue aging. These are indicated by braces “{” and “}” in this figure in blocks 542, 547 and 549. NextRedValid and NextGrnValid are flags used to support a flow queue aging design.

[0081] FIG. 11 illustrates handling of a flow queue selected for service from a timer based NLS or LLS calendar. A frame is dequeued from the selected flow queue (block 1101) and all the list management fields are updated as described in the Scheduler Structure Patent. The process continues to determine if the flow queue is to be reattached to the calendar or removed. At block 1106, a test for a zero frame count is made; if the frame count is zero (Yes branch from block 1106), the flow queue is removed from the calendar (block 1110) and the NextRedTime and RedResidue fields are updated as described in the Scheduler Structure Patent and the Calculation Patent.

[0082] Returning to block 1106, if the FrameCount is not zero, then the flow queue is reattached to the calendar at the location specified by the distance calculation (block 1108 for LLS and block 1109 for NLS calendar service) as described in the Scheduler Structure Patent and the Calculation Patent.
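The service step of FIG. 11 can be sketched as follows. This is a minimal illustration, not the implementation of the referenced patents: the function name `service_timed_calendar` is hypothetical, the flow queue is modeled as a `collections.deque` that is assumed to be non-empty on entry (the empty case is handled by FIG. 10), and the re-attachment `distance` is passed in as an already computed value because the distance calculation itself is described elsewhere.

```python
from collections import deque
from typing import Deque, Optional, Tuple


def service_timed_calendar(
    frames: Deque[str], p: int, distance: int
) -> Tuple[str, Optional[Tuple[str, int]]]:
    """Service one frame from a flow queue selected from an NLS or LLS calendar.

    Returns the dequeued frame and either (calendar, distance) for the
    re-attachment or None when the flow queue is removed from the calendar.
    """
    frame = frames.popleft()                    # block 1101: dequeue a frame
    if len(frames) == 0:                        # block 1106: frame count zero?
        return frame, None                      # block 1110: remove from calendar
    calendar = "NLS" if p == 1 else "LLS"       # P=1 means NLS, otherwise LLS
    return frame, (calendar, distance)          # blocks 1108 (LLS) / 1109 (NLS)
```

With two frames queued and P=1, the first call returns the first frame together with an NLS re-attachment, and the second call returns the last frame with no re-attachment.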

[0083] FIG. 12 illustrates service to a flow queue which was selected from a WFQ calendar and has a FrameCount of 1 or 0. At block 603, the condition for a frame count of 0 is tested; if true, then the flow queue is removed from the WFQ calendar (block 604) and the flow queue is examined for a best effort peak bandwidth specification at block 605. If the flow queue has a best effort peak bandwidth specification (No branch from block 605), then the NextGreenTime field is updated. The type of update is determined by testing to see if the frame service occurred too early as specified by the flow NextGreenTime field (at block 606). If the frame service occurred too early (Yes branch from 608), then the NextGreenTime field is updated using the calculation for a flow in violation; otherwise, the NextGreenTime field is updated using the calculation for a flow not in violation (both calculations are discussed in the Reconnection Patent).

[0084] Returning to block 601, the FrameCount is tested for a value of 1; if true, the flow queue is examined for a minimum bandwidth specification. If the flow queue has a minimum bandwidth specification, processing continues at 604 and is described above.

[0085] If the flow queue does not have a minimum bandwidth specification, then processing continues at block 514 in FIG. 9 and is described above.

[0086] FIG. 13 illustrates service to a flow queue which was selected from a Peak Burst Service (or PBS) calendar. A flow queue is always removed from this calendar and NextGreenTime is always recalculated. A flow queue is placed into the Weighted Fair Queuing (WFQ) calendar only when there are sufficient frames in the flow queue. If MBS is specified, a test for available credit is performed. In this case the service determines if the flow queue should be re-attached to a WFQ calendar or not. In any case, the flow queue is not re-attached to the PBS calendar, since the PBS calendar's purpose is to delay service of a flow queue that has exceeded its best effort peak bandwidth specification.

[0087] Block 1306 removes the flow queue from the PBS calendar. Block 1310 tests to see if the target port queue, as specified by the flow queue, is congested. Congestion is the result of more frames being placed into the target port queue than can be serviced by the attached media. This congestion test is the same as the one used when determining which WFQs should be considered for selection as described in the Scheduler Structure Patent. This test must be done here since the target port queue cannot be determined until the flow queue is selected. If the target port queue is congested (Yes branch from 1310), then a frame is not dequeued and processing continues at block 1312, where processing determines if the flow queue may be re-attached to a WFQ calendar. At 1312, the FrameCount is tested for a value greater than 1; if true, then the flow queue is added to the WFQ calendar (block 1315) at the location specified by the distance calculation for flow start up as described in the Scheduler Structure Patent. If the FrameCount is equal to 1 (block 1313), and the flow queue does not have a minimum bandwidth specification (Yes branch from block 1314), then the flow queue is added to the WFQ calendar (block 1315) at the location specified by the distance calculation for flow start up as described in the Scheduler Structure Patent. If the frame count is 0, or if the frame count is 1 and the flow queue does have a minimum bandwidth specification, the flow queue is not re-attached to the WFQ calendar.

[0088] Returning to block 1310, if the target port specified by the flow queue is not congested, then a frame is dequeued from the flow queue and processing continues at 1304, where the NextGreenTime field is updated using the calculation when a flow is not in violation. Processing continues at block 1316, where the flow queue is examined for a maximum burst size specification and, if one is specified, the MBSCredit of the flow queue is updated (block 1317).

[0089] Processing continues to determine if the flow queue can be re-attached to a WFQ calendar. Blocks 1307 and 1308 determine if there is a maximum burst size specification and if there is any MBSCredit remaining. If there is such a specification, and if there is no more credit, then the flow queue is not re-attached; otherwise processing continues at 1302, where the flow queue's FrameCount is examined. If the FrameCount is 2 or greater, then the flow queue is added to the WFQ calendar (1318) at the location specified by the WFQ distance calculation (1303). Returning to blocks 1302 and 1300, if the FrameCount is equal to 1, and there is no minimum bandwidth specification, then the flow queue is added to the WFQ calendar (1318) at the location specified by the WFQ distance calculation (1303). If the FrameCount is 0, or if the FrameCount is 1 and the flow queue has a minimum bandwidth specification, then the flow queue is not re-attached.
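The re-attachment rule applied after service from the PBS calendar can be summarized in a short predicate. The sketch below is a hedged illustration: the function name `pbs_reattach_to_wfq` and its parameters are hypothetical, `frame_count` is taken to be the FrameCount examined at block 1302 (or block 1312 on the congested path), and the WFQ distance calculation is not modeled. On the congested path the text describes no credit test, which corresponds to leaving `mbs_specified` at its default.

```python
def pbs_reattach_to_wfq(frame_count: int,
                        has_min_bandwidth: bool,
                        mbs_specified: bool = False,
                        credit_remaining: bool = True) -> bool:
    """Return True when a flow queue removed from the PBS calendar is
    re-attached to the WFQ calendar."""
    # Blocks 1307/1308: a burst size restriction with no credit left
    # prevents re-attachment.
    if mbs_specified and not credit_remaining:
        return False
    # Two or more frames: always re-attach.
    if frame_count >= 2:
        return True
    # Exactly one frame: re-attach only without a minimum bandwidth
    # specification.
    if frame_count == 1 and not has_min_bandwidth:
        return True
    # Zero frames, or one frame with a minimum bandwidth specification.
    return False
```

For instance, under these assumptions a flow queue holding a single frame is re-attached only when it has no minimum bandwidth specification, and an empty flow queue is never re-attached.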

[0090] Of course, many modifications of the present invention will be apparent to those skilled in the relevant art in view of the foregoing description of the preferred embodiment, taken together with the accompanying drawings. For example, the types of service which are accommodated are somewhat arbitrary and can be adjusted. For example, a user might have a first service during the day, when there are many users and high competition for bandwidth, and a higher service or service level at night, when there may be a lower demand for service. Additionally, many modifications can be made to the system implementation and the system of priorities, and various algorithms can be used for determining priority of service without departing from the spirit of the present invention. Accordingly, the foregoing description of the preferred embodiment should be considered as merely illustrative of the principles of the present invention and not in limitation thereof.

Having thus described the invention, what is claimed is:
 1. A system for processing frames and enqueuing the frames on an output where the system serves users having different types of service, the system comprising: a first calendar for serving users which have a first type of service; a second calendar for serving users which have a second type of service; a third calendar for serving users having a third type of service; a system which places frames in the first calendar when the user has a first type of service; a system which places frames in the second calendar when the user has a second type of service and is within the limits set by his level of service; a system which places frames in the third calendar when the user has selected that type of service and when the user has selected the second type of service but has exceeded the limits set for the second type of service; and a system which removes frames from the calendars according to stored logic.
 2. A system for processing frames and enqueuing them on an output including the elements of claim 1 wherein one type of service is a minimum bandwidth service and the system includes a timer for providing periodic service to a flow which has a minimum bandwidth to allow the minimum bandwidth to be provided.
 3. A system for processing frames and enqueuing them on an output including the elements of claim 2 wherein, when a flow which has minimum bandwidth service exceeds the minimum bandwidth service, the excess of the minimum bandwidth may be handled by another service.
 4. A system for processing frames and enqueuing them on an output including the elements of claim 1 wherein a service provides for a weighted fair queuing and the system includes a mechanism which determines the priority in the calendar.
 5. A system for processing frames and enqueuing them on an output including the elements of claim 4 wherein the mechanism which determines the priority in a calendar includes a calculation which is based on the length of at least one frame from the flow.
 6. A system for processing frames and enqueuing them on an output including the elements of claim 1 and further including a first system for providing minimum bandwidth service and a second system for providing weighted fair queuing service.
 7. A system for processing frames and enqueuing them on an output including the elements of claim 1 and further including a first system for providing minimum bandwidth service and a second system for providing weighted fair queuing service and the system further includes a service to provide weighted fair queuing service to a user who has minimum bandwidth service when the user exceeds the limits of the minimum bandwidth service.
 8. A system for processing frames and enqueuing them on an output including the elements of claim 1 and further including a first system for providing minimum bandwidth service, a second system for providing weighted fair queuing service and a third service which allows for best efforts service.
 9. A system for processing frames and enqueuing them on an output including the elements of claim 8 wherein the weighted fair queuing service includes a mechanism for adjusting the priority of a user according to the length of frames for that user.
 10. A method of placing processed frames on an output after processing and establishing and enforcing a system of different types of service levels, the method comprising the steps of: establishing at least a first and second type of service, with one of the types of service having a limit on the bandwidth which can be used; identifying a type of service with each flow of processed frames, and, for a service having a limit on the bandwidth which can be used, the respective limit; establishing a logical priority in serving the first and second types of service; allowing service for the higher priority service for a user until the user reaches the limit on the bandwidth which can be used; serving the service for the lower priority service when service for the higher priority service is not required; and treating requests for service from the higher priority service which exceed the limit on bandwidth which can be used to be considered as lower priority service requests.
 11. A method of placing frames on the output and establishing and enforcing a system of different types of service levels including the steps of claim 10 wherein the higher priority service includes a minimum bandwidth service up to an established bandwidth limit and a lower priority service is a best efforts service.
 12. A method of placing frames on the output and establishing and enforcing a system of different types of service including the steps of claim 10 and further including the step of establishing a third type of service and allocating a priority to the third type of service.
 13. A method of placing frames on the output and establishing and enforcing a system of different types of service including the steps of claim 12 wherein the third type of service is a fair queuing system.
 14. A method of placing frames on the output and establishing and enforcing a system of different types of service including the steps of claim 13 wherein the third type of service includes a system for weighting the priorities of different users of the service.
 15. A method of placing frames on the output and establishing and enforcing a system of different types of service including the steps of claim 14 wherein the third type of service includes a weighting for the length of the frame.
 16. A method of placing frames on the output and establishing and enforcing a system of different types of service including the steps of claim 10 wherein the steps of the method further include establishing separate calendars for at least two separate types of service.
 17. A system for processing frames and enqueuing the frames on an output where the system accommodates flows with different types of service including combinations of different types of service, the system comprising: a first calendar which supports a first service; a second calendar which supports a second service; logic which schedules frames onto the output from the first calendar and the second calendar, said logic including interaction between said first and second calendars to allow a single flow to be included on both calendars and to determine when the flow is enqueued on the output.
 18. A system for processing frames including the elements of claim 17 wherein the services are chosen from a group including minimum bandwidth, best effort, peak and maximum burst size, allowing a given flow to have both a minimum bandwidth service and best effort service, wherein the system includes a first calendar for servicing the minimum bandwidth and a second calendar for servicing the best effort and the logic places the given flow in both calendars to determine when it must come out, given the minimum bandwidth service and the best effort service.
 19. A method of processing frames and placing the processed frames from a plurality of flows onto an output based upon different types of service levels associated with the flows, the steps of the method comprising: establishing a first calendar to support a first type of service; establishing a second calendar to support a second type of service; determining the types of service which have been selected for a given flow and using the types of service to select the calendars which service the flow; using the calendars to determine the order in which processed frames from the flows are placed onto the output; and allowing a single flow to be placed on the first and second calendar and serviced from both the first and second calendar by using logic to determine when a flow is serviced.
 20. A method of processing frames including the steps of claim 19 wherein the types of service include minimum bandwidth and best effort with a calendar to support each type of service and the step of determining the types of service includes determining that a given flow has both minimum bandwidth and best effort and places the flow in both the calendar for minimum bandwidth and the calendar for best effort.
 21. A method of processing frames including the steps of claim 19 wherein the types of service include minimum bandwidth, best effort, peak and maximum burst size and the services include combinations of these types of service.